Unemployment Rates of Civilian Labor Force in U.S., 2015 ("Bokeh" Viz)

Description: Choropleth Map of Unemployment Rates in U.S.

[ACS 5-Year Data (2011-2015) per State & County]

We provide a Choropleth Map showing the Unemployment Rates across the U.S. States and their Counties. The measure covers only the Civilian Labor Force, which here includes people aged 18 to 64.

To do so, we use the Subject Tables of 2011-2015 American Community Survey (ACS) 5-Year Data, which are provided by the U.S. Census Bureau. The data dictionary of this data set can be found under the link below:

This is a huge file with more than 66,000 different variables. Once the attributes and measures of interest have been identified, we can make an appropriate API call and save the information we need in a JSON file.

Information Retrieval / Data Preparation

More specifically, we want the Unemployment Rate of the Civilian Labor Force (ages 18-64) per "State" and per "County of State", without breaking this measure down by any other characteristic of the population. For this we use the 'S2101_C02_034E' variable. We extract the table of interest and store it in a JSON file by making the API call below:

$ curl 'https://api.census.gov/data/2015/acs5/subject?get=S2101_C02_034E,NAME&for=county:*&in=state:*&key=YOUR_KEY_GOES_HERE' \
> -o ./acs5_subj_2015-S2101_C02_034E-counties_in_states.json
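The same request can also be assembled in Python. The sketch below only builds the URL and makes no network call. It assumes the `get` list is the variable followed by `NAME`, as the header row of the response suggests; the helper name `build_acs_url` is hypothetical and the key is a placeholder.

```python
from urllib.parse import urlencode

# Endpoint taken from the curl call above
BASE_URL = "https://api.census.gov/data/2015/acs5/subject"


def build_acs_url(variable, api_key):
    """Build the county-level request URL (hypothetical helper)."""
    params = {
        "get": f"{variable},NAME",  # assumption: variable first, then NAME
        "for": "county:*",          # every county ...
        "in": "state:*",            # ... in every state
        "key": api_key,
    }
    # Keep ':', '*' and ',' unescaped so the URL reads like the curl call
    return f"{BASE_URL}?{urlencode(params, safe=':*,')}"


url = build_acs_url("S2101_C02_034E", "YOUR_KEY_GOES_HERE")
print(url)
```

Building the query from a dictionary keeps each parameter on its own line, which makes a stray or duplicated parameter easier to spot than in a hand-written URL.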

The first 3 lines of the downloaded JSON file are:

[["S2101_C02_034E","NAME","state","county"],
["6.5","Autauga County, Alabama","01","001"],
["7.4","Baldwin County, Alabama","01","003"],

which can be further prepared with the following ndjson-cli commands (a Node.js package):

ndjson-cat ./acs5_subj_2015-S2101_C02_034E-counties_in_states.json \
> | ndjson-split 'd.slice(1)' \
> | ndjson-map '{id:d[2] + d[3], state:d[1].split(",")[1].slice(1), county:d[1].split(",")[0], S2101_C02_034E:d[0]}' \
> > ./acs5_subj_2015-S2101_C02_034E-counties_in_states-id.ndjson

The first 3 lines of this new NDJSON file (one JSON object per line) are:

{"id":"01001","state":"Alabama","county":"Autauga County","S2101_C02_034E":"6.5"}
{"id":"01003","state":"Alabama","county":"Baldwin County","S2101_C02_034E":"7.4"}
{"id":"01005","state":"Alabama","county":"Barbour County","S2101_C02_034E":"17.8"}

and it will be loaded as a Pandas DataFrame below.
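For readers without Node.js, the reshaping step can be approximated in plain Python. This is a minimal sketch, not the pipeline used above: the function name `rows_to_records` is hypothetical, and it uses `strip()` where the ndjson-map expression uses `slice(1)` to drop the space after the comma.

```python
import json


def rows_to_records(rows):
    """Turn the Census API array-of-arrays into one dict per county."""
    header, *data = rows  # drop the header row, like ndjson-split 'd.slice(1)'
    records = []
    for value, name, state_fips, county_fips in data:
        county, state = name.split(",")[:2]  # "Autauga County, Alabama"
        records.append({
            "id": state_fips + county_fips,  # 2-digit state + 3-digit county
            "state": state.strip(),
            "county": county,
            "S2101_C02_034E": value,
        })
    return records


sample = [
    ["S2101_C02_034E", "NAME", "state", "county"],
    ["6.5", "Autauga County, Alabama", "01", "001"],
    ["7.4", "Baldwin County, Alabama", "01", "003"],
]
for record in rows_to_records(sample):
    print(json.dumps(record))  # one JSON object per line
```

The sample rows are the ones shown earlier in this section; on the full download, the rows would come from `json.load` on the saved file.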

In addition, since we need a geographic definition of U.S. states and counties as of 2015, we download the corresponding "Cartographic Boundary Shapefile" that is provided by the official site of the U.S. Census Bureau. Then, we convert it into a GeoJSON file and enrich it with the unemployment rates that were extracted as above.
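The enrichment step is not shown in this notebook. One way to do it is sketched below, under two assumptions: each GeoJSON feature carries its 5-digit FIPS code in a top-level `id` member, and the property names are `state`, `county` and `unemp_rate_labor_force`, which are the fields the plotting code reads. The function name `enrich_features` is hypothetical.

```python
def enrich_features(geojson, records):
    """Copy state, county and rate into the properties of matching features."""
    by_id = {rec["id"]: rec for rec in records}
    for feature in geojson["features"]:
        # Assumption: the feature id is the 5-digit county FIPS code
        rec = by_id.get(feature.get("id"))
        if rec is None:
            continue  # no data for this shape: leave it untouched
        props = feature.get("properties") or {}
        props["state"] = rec["state"]
        props["county"] = rec["county"]
        props["unemp_rate_labor_force"] = float(rec["S2101_C02_034E"])
        feature["properties"] = props
    return geojson


# Tiny illustrative input (geometry omitted for brevity)
toy_geojson = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "id": "01001", "properties": None, "geometry": None},
        {"type": "Feature", "id": "99999", "properties": {}, "geometry": None},
    ],
}
toy_records = [
    {"id": "01001", "state": "Alabama", "county": "Autauga County",
     "S2101_C02_034E": "6.5"},
]
enriched = enrich_features(toy_geojson, toy_records)
print(enriched["features"][0]["properties"])
```

Matching on the string id, rather than on the county name, avoids collisions between counties that share a name across states.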

Note: The U.S. Census Bureau will stop receiving Application Programming Interface (API) calls at http://api.census.gov/data on August 28, 2017 and will instead use HTTPS, so a KEY to access this service will be necessary.

Sources

Loading Libraries and Data Sets

We load the necessary libraries and data sets:


In [1]:
# Required Libraries
import os
import pandas as pd

In [2]:
# Path Definitions of Required Data Sets
unemp_rate_labor_force_NDJSON = os.path.join('/media/ML_HOME/ML-Data_Repository/data/', 'unemp_rate_labor_force.ndjson')
us_counties_GeoJSON = os.path.join('/media/ML_HOME/ML-Data_Repository/maps', 'us_counties-albersUSA-Geo1.json')

In [3]:
# Load the NDJSON file with the measure(s) of interest into a Pandas DataFrame
unemp_rate_labor_force_df = pd.read_json(unemp_rate_labor_force_NDJSON, orient='records', lines=True)
unemp_rate_labor_force_df.head(5)


Out[3]:
   S2101_C02_034E          county    id    state
0             6.5  Autauga County  1001  Alabama
1             7.4  Baldwin County  1003  Alabama
2            17.8  Barbour County  1005  Alabama
3             8.3     Bibb County  1007  Alabama
4             7.7   Blount County  1009  Alabama
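Note that the `id` column above shows 1001 rather than "01001": the value was parsed as an integer, so the leading zero of the FIPS code is gone. This matters only if the DataFrame is later joined against string ids. A minimal sketch of restoring the padding in plain Python follows; the helper name `to_fips` is hypothetical. In pandas, the analogous step would be something like `df['id'].astype(str).str.zfill(5)`.

```python
def to_fips(county_id, width=5):
    """Restore a zero-padded county FIPS string from an integer id."""
    return str(county_id).zfill(width)


ids = [1001, 1003, 56045]
padded = [to_fips(i) for i in ids]
print(padded)  # ['01001', '01003', '56045']
```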

Next, we exclude "Puerto Rico" from our consideration and print quick summary statistics of the measure of interest, 'S2101_C02_034E':


In [4]:
unemp_rate_labor_force_df = unemp_rate_labor_force_df[unemp_rate_labor_force_df['state'] != 'Puerto Rico']

In [5]:
unemp_rate_labor_force_df['S2101_C02_034E'].describe()


Out[5]:
count    3142.000000
mean        7.785742
std         3.647364
min         0.000000
25%         5.400000
50%         7.400000
75%         9.700000
max        31.500000
Name: S2101_C02_034E, dtype: float64

Unemployment Rates across the U.S. States & Counties

Choropleth by leveraging the "Bokeh" library

Here, we provide a choropleth map showing the Unemployment Rates of the Civilian Labor Force (ages 18-64) across the U.S. States & Counties. To do so, we use the "Bokeh" Python library and the GeoJSON file that was produced and enriched with the data of interest, as described above. "Bokeh" is a Python library for interactive, D3-style visualizations in the browser; it renders through its own BokehJS runtime rather than through D3 itself.


In [6]:
# Load the necessary Bokeh components for the visualization
from bokeh.io import show, output_notebook
from bokeh.palettes import (
    Blues9 as palette1)
from bokeh.plotting import figure
from bokeh.models import (
    GeoJSONDataSource,
    LogColorMapper,
    HoverTool,
    LogTicker,
    PrintfTickFormatter,
    ColorBar)

# Load the enriched GeoJSON Data Source, with the measures of interest
with open(us_counties_GeoJSON, 'r') as f:
    geo_source = GeoJSONDataSource(geojson=f.read())

# Output the Choropleth Plot in Notebook
output_notebook()

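# PROVIDE THE CHOROPLETH OF INTEREST
# Reverse a copy of the palette (light = low, dark = high) instead of mutating
# the shared Blues9 object in place, so re-running the cell gives the same result
palette = list(palette1)[::-1]
color_mapper = LogColorMapper(palette=palette,
                              low=1,
                              high=35) # Maximum Unemployment Rate in DataFrame: 31.5%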

# Define the figure "Tools" we want to make available
TOOLS = "pan, wheel_zoom, reset, hover, save"

# Plot the figure
# Define the figure dimensions and its general details
# (Bokeh >= 3.0 renames plot_width/plot_height to width/height)
p = figure(title="Unemployment Rates in U.S. [Civilian Labor Force (18-64 yrs old), 2011-2015]", 
           tools=TOOLS,
           plot_width=960, plot_height=600,
           x_range=(0, 960), y_range=(600, 0),
           x_axis_location=None, y_axis_location=None)
           
# Render the county polygons as "Bokeh" patch glyphs
p.patches('xs', 'ys', source=geo_source,
          fill_color={'field': "unemp_rate_labor_force" ,'transform': color_mapper}, 
          fill_alpha=0.7, line_color="white", line_width=0.5)

# Configure the Hover Tool shown over each county
hover = p.select_one(HoverTool)
hover.point_policy = "follow_mouse"
hover.tooltips = [
    ("State", "@state"),
    ("County", "@county"),
    ("Unemployment Rate", "@unemp_rate_labor_force%"),
    ("(Long, Lat)", "($x, $y)"),
]

# Add a ColorBar Legend
color_bar = ColorBar(color_mapper=color_mapper, ticker=LogTicker(base=5),
                     formatter=PrintfTickFormatter(format="%d%%"),
                     background_fill_alpha=0.7,
                     label_standoff=5, 
                     major_label_text_color='black', 
                     major_tick_line_color='black', major_tick_line_width=1.3, major_tick_out=5,
                     border_line_color=None, location=(0,0),
                     orientation='horizontal', width=600)
p.add_layout(color_bar, 'above')

show(p)
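To make the effect of the logarithmic color scale concrete, the sketch below approximates how a rate is assigned to one of the 9 palette colors between low=1 and high=35. It is an illustration of log binning in general, not Bokeh's actual implementation, and the function name `log_bin` is hypothetical.

```python
import math


def log_bin(value, low, high, n_colors):
    """Approximate palette index for `value` on a log scale from low to high."""
    if value <= low:
        return 0                 # at or below `low`: first color
    if value >= high:
        return n_colors - 1      # at or above `high`: last color
    span = math.log(high) - math.log(low)
    fraction = (math.log(value) - math.log(low)) / span
    return min(int(fraction * n_colors), n_colors - 1)


# Rates taken from the tables above, plus the two bounds
for rate in (0.0, 1.0, 6.0, 17.8, 31.5, 35.0):
    print(rate, log_bin(rate, low=1, high=35, n_colors=9))
```

On a log scale the lower rates are spread over more colors than they would be on a linear scale, which helps separate the bulk of counties that sit between roughly 5% and 10%.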

